Enhancements to the Training Process of Classifier-Based Speech Translator via Topic Modeling

Authors

  • Emil Ettelaie
  • Panayiotis G. Georgiou
  • Shrikanth S. Narayanan
Abstract

Classification of sentences based on their meaning (or concept) has been used as a component in speech translation and spoken language understanding systems. Preparing training data for this type of classifier is often a tedious task. In our previous work, we presented a method of clustering sentences as a step toward automated annotation of concepts. To measure the distance between two sentences, that method relied on the local lexical dependencies in their translations. In this work, we apply topic modeling to enhance the previously proposed distance metric so that it also captures semantic associations among words. Our experiments on the DARPA USC Transonics and BBN Transtac data sets show that incorporating this information improves performance on a set of clustering tasks.
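The abstract does not give the formula for the enhanced metric, so the following is only a minimal sketch of the general idea: a lexical distance between two sentences is interpolated with a distance between their topic distributions. Everything here is an assumption made for illustration. The Jaccard distance stands in for the authors' translation-based lexical measure, the Hellinger distance stands in for whatever topic-space measure they used, the topic distributions are assumed to come from an already trained topic model, and the names `jaccard_distance`, `hellinger`, `combined_distance` and the parameter `weight` are hypothetical.

```python
import math


def jaccard_distance(tokens_a, tokens_b):
    """Lexical distance: 1 minus the word-set overlap (stand-in measure)."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def hellinger(p, q):
    """Hellinger distance between two discrete distributions, in [0, 1]."""
    squared = sum((math.sqrt(x) - math.sqrt(y)) ** 2 for x, y in zip(p, q))
    return math.sqrt(0.5 * squared)


def combined_distance(tokens_a, tokens_b, topics_a, topics_b, weight=0.5):
    """Interpolate the lexical and topic-based distances.

    `weight` (hypothetical) is the share given to the topic-based term;
    weight=0 falls back to the purely lexical distance.
    """
    lexical = jaccard_distance(tokens_a, tokens_b)
    semantic = hellinger(topics_a, topics_b)
    return (1.0 - weight) * lexical + weight * semantic


# Two sentences with little word overlap but similar (assumed) topic mixtures.
s1 = "where does it hurt".split()
s2 = "do you feel pain".split()
t1 = [0.8, 0.1, 0.1]  # assumed output of a topic model
t2 = [0.7, 0.2, 0.1]
print(combined_distance(s1, s2, t1, t2, weight=0.0))  # lexical only
print(combined_distance(s1, s2, t1, t2, weight=0.5))  # with topic information
```

In this toy case the two sentences share no words, so the lexical term alone treats them as maximally distant, while the topic term pulls them closer. A pairwise distance of this kind could then be passed to any standard clustering procedure.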


Similar resources

Towards unsupervised training of the classifier-based speech translator

Concept classification has proven to be a useful translation method for speech-to-speech translation applications. However, preparing training data for the classifier is a cumbersome task for human annotators. An unsupervised training method is introduced here that is based on utterance clustering. A technique to measure the distance between two utterances, based on the concepts they express, ...


Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...


Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...


Hierarchical classification for speech-to-speech translation

Concept classifiers have been used in speech-to-speech translation systems. Their effectiveness, however, depends on the size of the domain that they cover. The main bottleneck in expanding the classifier domain is the degradation in accuracy as the number of classes increases. Here we introduce a hierarchical classification process that aims to scale up the domain without compromising the accur...


Modeling the Relationship Between Affective Balance, Social Intelligence, and Speech Anxiety With the Mediating Role of Consequence Expectation Among Teacher-Training Students of Farhangian University

Background and Objectives: There are several predictor variables for speech anxiety (SA) among students. Affective balance (AB) and social intelligence (SI) are two main factors in this field. In this study, we assess the mediating role of consequence expectation (CE) among these variables. Accordingly, this study aims to explore a model of the relationships between AB, SI, and SA mediated by C...



Publication date: 2011